We introduce the concept of the area of a jet, and show how it can be used to perform the 
subtraction of even a large amount of diffuse noise from hard jets. 



1 Introduction 

Jet clustering algorithms, which map the particles observed in the final state of a high-energy 
collision into a smaller number of (usually) well defined objects - the jets - are widely used in 
the study of the properties of strong interactions. The jets are usually meant to be good proxies 
of the original partons (though the detailed relation is more subtle), and by studying them one 
tries to probe the underlying dynamics. The reason for using the jets, rather than directly the 
observed hadrons, is that they can be construed as infrared-safe observables: they are therefore 
amenable to perturbative QCD predictions, and their sensitivity to non-perturbative phenomena 
(hadronisation, underlying event and pileup effects) can either be kept under control or corrected 
for. 

In this talk we explore the susceptibility of jets to contamination from soft 
radiation distributed in the form of a roughly uniform and diffuse background. Physical examples 
are the pileup generated by multiple minimum bias collisions in high-luminosity hadron 
colliders like the LHC, the many particles produced in a central heavy ion collision and, to a 
lesser extent, the underlying event given by perturbative and non-perturbative QCD radiation 
whenever strongly-interacting particles are produced at high energy. We shall argue that this 
susceptibility can be quantitatively characterised in terms of the novel concept of the area of a jet, 
which we shall rigorously introduce. In turn, this will suggest a procedure by means of which 
such contamination can be subtracted from the jet momentum, so as to recover - to a large 
extent - its proxy relation with the parton it originated from. 



In collaboration with Gavin Salam and Gregory Soyez. Presented at Moriond QCD, La Thuile, Italy, March 
2007. To appear in the Proceedings. 
Figure 1: (a) Active area distributions for the $k_t$ algorithm; Cambridge/Aachen has a very similar behaviour. 
(b) Average area of the jet containing a hard particle as a function of the ratio of its momentum to that of the soft 
background jets. 



Naively, one can think of the jet area as the surface (in the rapidity-azimuth plane) over 
which the particles that have been clustered into a given jet are distributed. One can also 
assume that the amount of diffuse background radiation clustered together with the jet will be 
proportional to this area. One could therefore think of somehow determining the momentum 
surface density of this noise, $\rho$, and then subtracting from the jet momentum a quantity 
given by $\rho$ times the area of the jet. 

Before such a program can be implemented in practice, however, the jet area needs to be 
defined more rigorously, and a procedure to extract $\rho$ must be devised. This is done in [?] and [?] 
respectively, where both aspects are introduced and extensively studied. 



2 Jet Area 

The naive vision of the jet area as the surface covered by the particles that make up the jet 
quickly turns out to be fallacious: as the particles are point-like, this area is zero. Drawing some 
sort of boundary, for instance a convex hull - the minimal set of particles such that all the 
others are contained in the polygon drawn through them - is also prone to ambiguities: different 
jets may overlap, and a region of space might be arbitrarily assigned to a jet irrespective of 
the properties of the clustering algorithm. 

To overcome these difficulties, we propose a definition of jet area which is inherently related 
to the clustering procedure, and which can properly account for the jet contamination due to 
a diffuse background. Our definition relies strictly on the infrared-safety property that 
a good jet algorithm should have: the addition of one (or many) soft particles to the event 
should not change the final set of hard jets. We therefore add a large number of uniformly 
distributed and extremely soft particles (ghosts) to the event, and cluster them together with 
the real particles. At the end of the clustering procedure, the number of ghosts clustered with 
each jet will provide a robust measure of the jet's extension in the rapidity-azimuth plane, and 
therefore define its active area. 
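
The ghost-counting definition above can be illustrated with a minimal Python sketch. It is not the FastJet implementation: it assumes a naive inclusive $k_t$ clustering with a pt-weighted recombination, and the function names (`kt_cluster`, `delta_phi`), the number of ghosts, the ghost momentum of 1e-50 GeV and the rapidity range are all choices made here for illustration only.

```python
import math
import random

def delta_phi(a, b):
    """Signed azimuthal difference b - a, wrapped into [-pi, pi)."""
    return (b - a + math.pi) % (2 * math.pi) - math.pi

def kt_cluster(particles, R=1.0):
    """Naive inclusive kt clustering, roughly O(N^3); for illustration only.

    Each particle is a list [pt, y, phi, n_ghosts]; the final jets are
    returned in the same format.
    """
    objs = [list(p) for p in particles]
    jets = []
    while objs:
        # Smallest beam distance d_iB = pt_i^2.
        b = min(range(len(objs)), key=lambda k: objs[k][0])
        d_min, pair = objs[b][0] ** 2, None
        # Smallest pair distance d_ij = min(pt_i^2, pt_j^2) * dR_ij^2 / R^2.
        for i in range(len(objs)):
            pti, yi, phii, _ = objs[i]
            for j in range(i + 1, len(objs)):
                ptj, yj, phij, _ = objs[j]
                dr2 = (yi - yj) ** 2 + delta_phi(phii, phij) ** 2
                dij = min(pti, ptj) ** 2 * dr2 / R ** 2
                if dij < d_min:
                    d_min, pair = dij, (i, j)
        if pair is None:
            jets.append(objs.pop(b))       # object b becomes a final jet
        else:
            i, j = pair
            pti, yi, phii, ni = objs[i]
            ptj, yj, phij, nj = objs[j]
            pt = pti + ptj                  # pt-weighted recombination
            y = (pti * yi + ptj * yj) / pt
            phi = (phii + ptj / pt * delta_phi(phii, phij)) % (2 * math.pi)
            objs[i] = [pt, y, phi, ni + nj]
            objs.pop(j)                     # j > i, so index i stays valid
    return jets

random.seed(1)
n_ghost, y_max, R = 200, 2.0, 1.0
ghost_area = 2 * y_max * 2 * math.pi / n_ghost   # area represented by one ghost

# One hard particle plus uniformly distributed, extremely soft ghosts.
event = [[100.0, 0.0, math.pi, 0]]
event += [[1e-50, random.uniform(-y_max, y_max), random.uniform(0, 2 * math.pi), 1]
          for _ in range(n_ghost)]

jets = kt_cluster(event, R)
hard_jet = max(jets, key=lambda jet: jet[0])
hard_area = hard_jet[3] * ghost_area     # active area = ghost count x area per ghost
print(f"hard jet: pt = {hard_jet[0]:.1f} GeV, active area = {hard_area / math.pi:.2f} pi")
```

With only 200 ghosts the area is resolved coarsely; the accuracy improves with the ghost density, at the cost of a much longer clustering time in a naive implementation such as this one.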

Fig. 1(a) shows how the values of this active area are distributed for two kinds of events: at 
one extreme, jets made up of many uniformly distributed particles with similar momenta (the 
pure-ghost jets); at the other extreme, a jet containing a single hard particle. We can see that 
these two situations produce different distributions for the active areas, with different averages: 

The drawback of this procedure is that a very large number of particles needs to be clustered (a few thousand 
ghosts are needed to achieve accuracies of the order of one per cent). This would be unfeasible - or at least 
extremely impractical - without the fast implementations of the $k_t$ [?] and the Cambridge/Aachen [?] jet algorithms 
provided by FastJet [?]. This package also provides the tools to calculate the area of the jets, as well as an interface 
to the new infrared-safe cone algorithm SISCone [?]. 
Figure 2: A dijet event superimposed on 10 minimum bias events generated by moderate-luminosity pileup in pp 
collisions at the LHC, as simulated by PYTHIA. 

the jets containing many similar particles have a typical area of order $\langle A^{\mathrm{soft}} \rangle \simeq 0.55\,\pi R^2$ ($R$ is 
the typical radius parameter present in most jet algorithms), while the jets containing a single 
hard particle tend to be larger, their average area being $\langle A^{\mathrm{hard}} \rangle \simeq 0.81\,\pi R^2$. 

One can take this exploration of similarities and differences between soft (i.e. uniform) 
and hard jets further, and explore how the transition takes place: fig. 1(b) shows the average area 
of the jet containing the single "hard" particle as its transverse momentum $p_t$ changes from being 
negligible with respect to the soft background to being much larger. One can see that in the 
$p_{t,\mathrm{hard}} \gg \langle p_{t,\mathrm{soft\ jet}} \rangle$ limit the $\simeq 0.81\,\pi R^2$ value for the average area is recovered. On the other 
hand, in the opposite $p_{t,\mathrm{hard}} \ll \langle p_{t,\mathrm{soft\ jet}} \rangle$ limit the "hard" jet now behaves like a soft one, the 
difference in average area being only of a probabilistic nature, related to the "measurement" of the 
area of the specific jet containing a given particle. 

3 Noise Level 

The estimation of $\rho$, the typical level of the background radiation, could probably be performed 
in many ways. The method we propose here is related to the jet areas discussed above. It relies 
on the observation that the transverse momentum of a jet divided by its area, $p_{t,i}/A_i$, behaves 
differently for the hard jets and for the background ones. Typically, the jets originating from the 
background radiation cluster in a band, while the hard jets stick out. This is clearly 
shown in fig. 2. This event is a simulated pp collision at the LHC at moderate luminosity: 10 
additional minimum bias events are added to the main hard collision, which produces a dijet 
event with jets of transverse momentum of the order of 50 GeV. Fig. 2 (left) shows that the areas 
of the various jets can fluctuate widely. However, when the same jets are plotted in terms of 
$p_{t,i}/A_i$ (right plot) one clearly sees the band established by the background. Different strategies 
can be devised to quantitatively determine its level. One of the simplest is to take the 
median of all the $p_{t,i}/A_i$, an operation that prevents the few hard jets from biasing its value. We 
therefore define: 

$$\rho = \mathrm{median} \left[ \left\{ \frac{p_{t,i}}{A_i} \right\} \right] . \qquad (1)$$

In the specific case of the event of fig. 2, the momentum density of the background is therefore 
$\rho \simeq 6$ GeV per unit area. 

4 Background Subtraction 

Once the area of each jet, $A_i$, and the noise level $\rho$ are known, one can correct the transverse 
momentum via the following operation: 

$$p_t^{\mathrm{sub}} = p_t - \rho A . \qquad (2)$$

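The median estimate of eq. (1) and the subtraction of eq. (2) amount to very little code. The sketch below is only an illustration: the function names `estimate_rho` and `subtract_background` are hypothetical, and the list of jets is invented data (two hard jets plus a few background jets), not taken from the event of fig. 2.

```python
from statistics import median

def estimate_rho(jets):
    """Median of pt_i / A_i over all jets in the event, as in eq. (1)."""
    return median(pt / area for pt, area in jets)

def subtract_background(pt, area, rho):
    """Corrected transverse momentum pt - rho * A, as in eq. (2)."""
    return pt - rho * area

# Hypothetical event: (pt in GeV, area) pairs, invented for illustration.
jets = [(62.0, 2.1), (58.5, 1.9), (9.0, 1.5), (12.6, 2.0),
        (5.5, 1.0), (15.0, 2.4), (7.2, 1.2)]

rho = estimate_rho(jets)
corrected = [subtract_background(pt, area, rho) for pt, area in jets]

for (pt, area), pt_sub in zip(jets, corrected):
    print(f"raw pt = {pt:5.1f} GeV, area = {area:.1f}, corrected pt = {pt_sub:6.2f} GeV")
```

Because the median is used, the two hard jets with large pt/A do not pull the estimate of the background level upwards, which a plain average would do.
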
We show how this works in practice by considering the following toy model: we generate many 
events which contain a single hard particle, with a transverse momentum $p_t^{\mathrm{hard}} = 100$ GeV, 
embedded in a background of 10000 soft particles, each with an average transverse momentum 
$\langle p_t^{\mathrm{soft}} \rangle = 1$ GeV (with small fluctuations, 10%, around this value) and distributed uniformly at random 
in rapidity and azimuth up to $y_{\max} = 4$. In this particular case we can of course 
calculate the transverse momentum density (per unit area) of the soft particles from the input 
parameters, since we know how we generated them: 

$$\rho = \frac{dp_t}{dy\,d\phi} = \frac{10000 \times 1\ \mathrm{GeV}}{2 \times y_{\max} \times 2\pi} \simeq 200\ \mathrm{GeV} . \qquad (3)$$

This situation might look extreme, but similar values are expected in realistic cases, like a central 
Pb Pb collision at the LHC. 

We know from the previous section that an average soft jet, when clustered with the $k_t$ or 
the Cambridge/Aachen algorithm with $R = 1$, has an area of order $0.55\,\pi$. This translates 
into a typical transverse momentum $p_t^{\mathrm{soft}} \simeq \rho \langle A^{\mathrm{soft}} \rangle \simeq 350$ GeV. Such jets would already 
dwarf the hard particle of 100 GeV. However, this particle will itself be embedded in a jet 
also containing many soft particles: this jet will therefore have a typical transverse momentum 
of the order of 350 + 100 GeV, but huge fluctuations will be visible from one event to another, 
as the amount of background clustered with it will vary considerably. 
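
The numbers quoted for the toy model can be checked with a few lines of arithmetic. The sketch below only restates the input parameters given in the text; the variable names are chosen here for illustration.

```python
import math

n_soft = 10000        # number of soft particles per event
mean_pt_soft = 1.0    # GeV, average pt of each soft particle
y_max = 4.0           # particles are generated in |y| < y_max
R = 1.0               # jet radius parameter

# Total soft pt divided by the area of the rapidity-azimuth region.
rho = n_soft * mean_pt_soft / (2 * y_max * 2 * math.pi)

# Typical soft-jet pt: rho times the average soft-jet area 0.55 * pi * R^2.
pt_soft_jet = rho * 0.55 * math.pi * R ** 2

print(f"rho = {rho:.1f} GeV per unit area")
print(f"typical soft-jet pt = {pt_soft_jet:.1f} GeV")
```

The density comes out just below 200 GeV per unit area and the typical soft-jet momentum at about 344 GeV, consistent with the rounded value of 350 GeV used above.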

This means that both the absolute energy scale and the energy resolution are degraded by 
the presence of the background, as shown in fig. 3: the transverse momentum of the hard jet is 
displaced, by an amount consistent with our estimate, and the resolution is hopelessly bad 
(green histogram, "raw"). However, once the 
subtraction is performed according to eq. (2) (using for each event the $\rho$ directly extracted from 
the clustering, as explained in Sec. 3, and not the fixed value of eq. (3), of course), the correct 
average transverse momentum is recovered, together with a large fraction of the resolution (blue 
histogram, "corrected"). 

Figure 3: Jets containing a hard particle with $p_{t,\mathrm{hard}} \simeq 100$ GeV clustered together with a soft background 
(green, "raw" histogram), and after its subtraction (blue, "corrected" one). The '4-vector' versions of the 
area and of the subtraction, more appropriate for large $R$, have been used for this plot. 

This toy model shows the feasibility and the accuracy of the determination of the noise 
level and of the subtraction procedure. More realistic examples, and references to experimental 
investigations of the problem of background subtraction, are given in [?]. 